ABSTRACT

The SMc01113/YbeY protein, belonging to the 
UPF0054 family, is highly conserved in nearly every 
bacterium. However, the function of these proteins 
still remains elusive. Our results show that 
SMc01113/YbeY proteins share structural similarities
with the MID domain of the Argonaute (AGO)
proteins, and might similarly bind to a small-RNA
(sRNA) seed, making a special interaction with the
phosphate on the 5'-side of the seed, suggesting
they may form a component of the bacterial sRNA
pathway. Indeed, eliminating SMc01113/YbeY expression
in Sinorhizobium meliloti produces symbiotic
and physiological phenotypes strikingly similar
to those of the hfq mutant. Hfq, an RNA chaperone,
is central to the bacterial sRNA pathway. We evaluated
the expression of 13 target genes in the smc01113 
and hfq mutants. Further, we predicted the sRNAs 
that may potentially target these genes, and 
evaluated the accumulation of nine sRNAs in WT 
and smc01113 and hfq mutants. Similar to hfq, 
smc01113 regulates the accumulation of sRNAs as 
well as the target mRNAs. AGOs are central compo- 
nents of the eukaryotic sRNA machinery and con- 
ceptual parallels between the prokaryotic and 
eukaryotic sRNA pathways have long been drawn. 
Our study provides the first line of evidence for such 
conceptual parallels. Furthermore, our investigation 
gives insights into the sRNA-mediated regulation of 
stress adaptation in S. meliloti. 



INTRODUCTION 

In a symbiotic event, rhizobial bacteria, e.g. Sinorhizobium 
meliloti, colonize the legume roots to fix biological 
nitrogen. This symbiosis is extremely important from
agricultural, economic and environmental perspectives,
and the interaction of rhizobial bacteria with their
legume hosts is also a valuable model system to study chronic
infection processes (1). The rhizobial bacteria alternate
between a free-living phase in the soil/rhizosphere and a
symbiotic phase within the host plant cells, where they
differentiate into nitrogen-fixing bacteroids (2). Symbiosis
comprises a complex series of events that begin with the 
infection process, which is initiated with the exchange of 
signals between the bacteria and the plant hosts (3,4). The 
signal components include the NOD factors, secreted 
proteins, polysaccharides, flavonoids, ligand-gated-ion 
channels, the calcium-calmodulin dependent protein 
kinases and calcium spiking (3,5). Invasion of plant 
roots by bacteria involves the development of an infection 
thread and cytoskeleton remodeling. During the invasion,
bacteria encounter a plant innate immune response: a
burst of reactive-oxygen-species (ROS) (4). Further
along, the infection thread is targeted to the cortical
tissues, where the rhizobia are endocytosed, resulting in
the formation of a symbiosome, followed by their differentiation
into nitrogen-fixing bacteroids. Adaptation to the
extracellular environment of the host cells requires lipo- 
polysaccharide (LPS) and non-LPS factors such as those 
needed for coping with ion-stress (3,4). Bacteria enclosed 
within the symbiosome membrane encounter very low 
oxygen levels, where they can express enzymes of the 
nitrogenase complex and begin nitrogen fixation (4). 
Fixed carbon provided by plants in the form of 
dicarboxylic acids is metabolized to provide the energy
necessary for nitrogen fixation. 

Large-scale transcriptional, proteomic and metabolic 
rearrangements underlie the complex transition from the 
free-living state to the symbiotic state, as well as the 
changes when the free living bacteria adapt to changing 
environmental conditions (2,6-9). For example, a 
'base-line proteomic output' evaluated proteins from 
rhizobia grown in several conditions: an accumulation of 
a total of 2224 proteins, derived from 810 distinct genes 
and resulting in alteration of 53 metabolic pathways was 
determined (8). Analyses comparing transcription during 
microoxic versus oxic conditions and between free-living 
versus symbiotic states in S. meliloti revealed hundreds of 
genes, including several potential regulators, being differ- 
entially regulated (2). The microoxic and bacteroid tran- 
scriptomes partially overlapped, suggesting complex 
modes of gene regulation in different environments. 
Furthermore, the development of a 'dual-genome symbi- 
osis chip' facilitated identification of changes in approxi- 
mately 5000 transcripts during symbiosis and when 
S. meliloti is grown in different environments (9). Thus, 
complex transcriptional rearrangements occur in 
S. meliloti during its adaptation to environmental 
stresses and symbiosis. How these large scale transcrip- 
tional and proteomic responses are coordinated remains 
largely unknown. The magnitude of responses suggests 
that transcriptional regulation by certain proteins may 
not be the only mechanism of gene regulation. 

RNA silencing has emerged as a fundamental regula- 
tory process affecting many layers of endogenous gene 
expression; non-coding small-RNAs (sRNAs) appear to 
be important regulators of gene expression during stress 
responses in both eukaryotes and bacteria. It is only rela- 
tively recently that the pervasiveness, importance and 
roles of the sRNA-mediated regulation in bacteria have 
become apparent. They modulate transcription, transla- 
tion, mRNA stability and genomic integrity via several 
mechanisms including changes in RNA conformation, 
protein binding, base pairing with target RNAs and inter- 
actions with DNA (10). Most known sRNAs regulate gene 
expression by base pairing with their target mRNAs that 
is facilitated by the RNA chaperone, Hfq; Hfq is 
implicated as a central player in the sRNA-mediated regu- 
lations (11-14). 

Although much is understood about the role of sRNAs 
in modulating gene expression in the model bacteria, 
Escherichia coli and Salmonella (10,15), comparatively 
less is known about the possible role of the sRNA- 
mediated interaction in the model symbiotic bacteria, 
S. meliloti. A few studies have recently appeared that 
use comparative genomics and prediction approaches to 
identify sRNAs in the S. meliloti genome (16-18). 
Although expression of a few of the predicted sRNAs was
verified in these studies, the physiological roles of most of 
these sRNAs, their possible targets and the pathways that 
may be under their control remain largely unknown. In 
our and related recent studies, mutation of hfq resulted in 
S. meliloti that are highly compromised in their symbiotic 
ability (19-22), suggesting that the sRNA-pathways form 
a central component of the gene regulation during
symbiosis and stress adaptation. Characterization of the
symbiotically deficient hfq mutant provided important 
clues in understanding the underlying molecular mechan- 
isms of regulation during symbiosis and stress adaptation. 
Nevertheless, hardly any specifics for Hfq-dependent 
sRNA-mRNA interactions are known in S. meliloti. 

A screen that we had conducted to identify oxidative-stress-compromised,
symbiotically deficient mutants led us
to identify the SMc01113 gene as essential for symbiosis
(23,24). SMc01113 encodes a protein of unknown
function that is strongly conserved in most bacteria (24);
the Escherichia coli ortholog is named YbeY. The SMc01113/YbeY
proteins belong to the UPF0054 family. While the
crystal structures of the YbeY/UPF0054 members from 
E. coli, Aquifex aeolicus and Thermotoga maritima have 
been determined and suggest that YbeY is a metallo- 
protein (25-27), the cellular and biochemical function of 
this protein remains elusive. It has been annotated as a 
heat shock protein, as a peptidase or even as a protease 
(25,28,29). Recently, in E. coli, YbeY has been reported 
to be involved in translational regulation during high- 
temperature adaptation, presumably by affecting 
ribosome activity (28,29). 

In the current investigation, we explore the possible 
cellular function of the S. meliloti SMc01113 gene that 
appears to be central to symbiosis. Our sequence and 
structure analysis suggested SMc01113's similarity to a 
domain of Argonaute (AGO) proteins, which are central 
to sRNA mediated interaction in eukaryotes (30,31). 
AGOs are multi-domain proteins comprising N, PAZ,
MID and PIWI domains: the N- (or the amino-terminal) 
domain projects a basic surface face for putative RNA 
recognition; the PAZ domain contains a specific binding 
pocket that anchors the characteristic 2-nt 3'-overhang 
that results from digestion of RNAs by RNase III (a 
step in the processing of sRNAs); the PIWI domain 
shows extensive homology to RNase H, and these are
often referred to as 'slicers'; the MID domain resides
between the PAZ and PIWI domains and binds the char- 
acteristic 5'-phosphates of smRNAs, thus anchoring 
them onto the Ago protein; the MID domain has also 
been implicated in protein-protein interactions (32,33). 
AGO proteins constitute the key components of the 
'RNA-induced-silencing-complex' (RISC) that modulates 
sRNA-dependent gene expression, and are mostly found 
in eukaryotes. This suggested the possibility that 
SMc01113 may form a component of, or be involved in,
the interactions mediated by sRNA-pathways. We used a 
comparative approach with hfq as a reference to investi- 
gate this hypothesis. Our results suggest a possible role 
of SMc01113 in interactions mediated by the
sRNA-pathway. 

MATERIALS AND METHODS 

Strains, media and culture conditions 

Strains used in this study are detailed in the 
Supplementary Table S2. Sinorhizobium meliloti strains 
were grown aerobically at 30°C in the complex LB
medium containing 5 ml/l of 0.5 M CaCl2 and 25 ml/l of
MgSO4 (LB-MC) to an optical density at 600 nm (OD600) of
1.5-2; they were then inoculated in the LB-MC or
rhizobial minimal medium (RMM) with glucose at an
OD600 of 0.1 (19,20). For selection of S. meliloti, streptomycin
(500 µg/ml), gentamycin (50 µg/ml) and neomycin
(200 µg/ml) were used. Bacto agar was used at 1.5% for
solid medium. 

Plant nodulation assay and fitness estimates 

Nodulation assays for plants were conducted as previously 
described (19). Briefly, the alfalfa (Medicago sativa cv.
Iroquois) seeds were surface sterilized and germinated on
0.8% agar in water in the dark for 3 days; seedlings were
placed on the top of a petri dish with 25 ml Jensen's
agar, and inoculated with 1 ml of the appropriate strain of
S. meliloti grown to saturation in LB-MC. Plates were
wrapped in aluminium foil and incubated at 25°C for
4 weeks, after which fitness parameters such as plant 
height, number of pink/white nodules, number of green 
leaves and plant dry weight were recorded. 

Physiological characterization following challenge 
with stresses 

Bacterial response to various stresses for the WT S. meliloti
and the smc01113 and hfq mutants was evaluated by analysis of
the growth curves in liquid LB-MC or RMM. The effect
of mutating SMc01113 on swarming behavior was
evaluated on swarming plates by measuring the diameter
of bacterial spreading as previously described (19) and
compared to the WT and hfq mutant. Similarly, the
effect of heat shock (50°C for 35^10 min) was evaluated
on the WT, smc01113 and hfq mutants as previously
described (19,20). Ability to respond to various other
environmental stresses was evaluated by growing the WT,
smc01113 and hfq mutants with or without Paraquat
(100 mM), MMS (0.025%), ethanol (2.5%), SDS (0.1%),
NaCl (50 mM) or cefotaxime (50 µg/ml) for 48 h, where
the cells were incubated with shaking at 30°C. Aliquots
were collected at different time intervals, OD600 was
measured and residual growth was determined (19,20).
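
The residual-growth calculation is not spelled out in the text. A minimal sketch follows, assuming residual growth is the OD600 of the stressed culture expressed as a percentage of the OD600 of the matching unstressed culture at the same time point. The function name and the OD600 readings are hypothetical and are not data from this study.

```python
def percent_residual_growth(od_stressed: float, od_unstressed: float) -> float:
    """Growth under stress as a percentage of growth without stress.

    Assumption: both OD600 readings come from the same strain
    at the same time point.
    """
    if od_unstressed <= 0:
        raise ValueError("unstressed OD600 must be positive")
    return 100.0 * od_stressed / od_unstressed


# Hypothetical OD600 readings (stressed, unstressed); illustrative only.
readings = {"WT": (1.80, 2.00), "mutant": (0.50, 2.00)}

residual = {
    strain: percent_residual_growth(stressed, unstressed)
    for strain, (stressed, unstressed) in readings.items()
}
```

Normalizing each strain to its own unstressed culture keeps a general growth defect of a mutant from being mistaken for stress sensitivity.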

Sequence and structure analysis 

Similarity between the SMc01113, YbeY and AGO 
protein sequences was determined by aligning the sequences
with CLUSTALW (http://www.ebi.ac.uk/Tools/clustalw2/index.html).
Structural homology search was
performed using YbeY (PDB ID: 1XM5; chain A) as
search probe on the DALI server (34) and pair-wise comparisons
were done using DaliLite (35). Structural similarities
between the YbeY (1XM5) and the A. aeolicus AGO 
(2NUB) protein were also determined by super-imposing 
the YbeY structure (27) with the structures of complete 
AGO protein and MID/PIWI domain (32) using the 
Crystallographic Object-Oriented Toolkit [COOT; 
(36,37)]. YbeY structure was analyzed by Hotpatch 
server (http://hotpatch.mbi.ucla.edu/) (38) for the DNA/ 
RNA binding properties and for finding functionally important
patches on its surface. YbeY (PDB ID: 1XM5;
chain A) and 4mer RNA (generated from PDB ID:
2BGG, chain R) were docked by using the ClusPro 2.0
docking server (http://cluspro.bu.edu) in order to model
YbeY and RNA interactions. A single phosphate (PO4)
ion was manually docked in COOT (36,37). Electrostatic
surface calculations and all figures were made with PyMOL.
Sequences homologous to SMc01113/YbeY in other
bacterial species were extracted after performing a non- 
redundant global BLAST search with the SMc01113. 
Homologous sequences were extracted from NCBI, 
multiple alignments were made and phylogenetic tree 
was constructed using CLUSTALW. Similarly, homolo- 
gous sequences of the S. meliloti sRNAs were searched 
in closely related species (S. medicae, Rhizobium sp.
NGR234, R. leguminosarum and R. etli) using the
'genome BLAST' tool at the NCBI
(http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi). The homologous
sRNA sequences were extracted, multiple alignments 
were made and phylogenetic trees were constructed 
using CLUSTALW. 

Expression analysis by quantitative real-time PCR 

The smc01113 and hfq mutants and the WT S. meliloti 
were grown aerobically at 30°C in the LB-MC medium
to an OD600 of 1.5-1.8; they were then inoculated in
RMM with 0.5% glucose, and grown for 24 or 48 h,
after which the cells were harvested by centrifugation.
Total RNA was isolated by TRI reagent (T9424;
Sigma, St Louis, MO, USA) using manufacturer's 
protocol. SYBR green assays were developed using 
Express One-step SYBRGreenER kit (#11780-200; 
Invitrogen; http://www.invitrogen.com) following the 
manufacturer's protocol. All the quantitative real-time 
PCR (qPCR) assays were performed with 50 ng total 
RNA. Three to four biological replicates were used for 
each genotype; each biological replicate was used twice 
on the qPCR plate. The $2^{-\Delta\Delta C_t}$ method (relative quantitation)
was used for data analysis. Briefly, the $C_t$ (threshold
cycle) values of target genes were normalized to the
endogenous reference gene ($\Delta C_t = C_{t,\mathrm{target}} - C_{t,\mathrm{reference}}$)
and compared with those values obtained from the calibrator
($\Delta\Delta C_t = \Delta C_{t,\mathrm{sample}} - \Delta C_{t,\mathrm{calibrator}}$). To simplify
data interpretation, expression levels in control samples
(WT, Rm1021) were fixed to 1 and relative expression
levels were calculated with respect to this reference value 
in the smc01113 and hfq mutants. A list of primers used in 
this study is available in the Supplementary Table S3. All 
the gene-specific primers were designed with Primer3 
software (http://frodo.wi.mit.edu/primer3/), and all the 
assays were run on the 'AB7500 fast real-time PCR
system' (Applied Biosystems;
http://www.appliedbiosystems.com/absite/us/en/home.html).
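
The relative quantitation described above can be sketched as follows. This is a minimal illustration, assuming equal (about 100%) amplification efficiency for the target and reference genes, as the method requires. The function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        cal_target: float, cal_reference: float) -> float:
    """Fold change of a sample relative to the calibrator.

    ct_target / ct_reference: Ct values measured in the sample.
    cal_target / cal_reference: Ct values measured in the calibrator (WT).
    """
    delta_ct_sample = ct_target - ct_reference          # normalize sample
    delta_ct_calibrator = cal_target - cal_reference    # normalize calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical mean Ct values; illustrative only.
wt = {"target": 22.0, "reference": 18.0}
mutant = {"target": 24.0, "reference": 18.0}

fold_wt = relative_expression(wt["target"], wt["reference"],
                              wt["target"], wt["reference"])
fold_mutant = relative_expression(mutant["target"], mutant["reference"],
                                  wt["target"], wt["reference"])
```

The calibrator compared with itself always yields 1, which is why the WT expression level is fixed to 1; a target that crosses threshold two cycles later in the mutant corresponds to a four-fold lower transcript level.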

Bioinformatic analysis of sRNAs-target interactions 

In order to determine if the S. meliloti sRNAs targeted 
the genes differentially regulated in the smc01113 and 
hfq mutants, and informed by the fact that sRNAs 
regulate the gene expression by mostly base pairing with 
their target mRNA sequences, we used the concept of 'seed 
pairing', already used in some other bacteria (39-42). 
Complete coding sequences, obtained from the Rhizobase
database (http://genome.kazusa.or.jp/rhizobase/), were
used for genes differentially regulated in the smc01113 and
hfq mutants. Using custom-written Perl scripts, 'seed-pair
analysis' of sRNAs was conducted for the target genes:
small 'seed' sequences, >7 nt, starting from the first
as well as the second nucleotide at the 5'-end of the
sRNA, with a floating window of 1 nt, were generated.
Then, sequences from the 3'-ends of the target mRNAs
were compared for perfect complementarity with
sRNA seed sequences from the 5'-ends. These seeds were
mapped to the sequences of the target genes for 'Watson-
Crick' base pairing, and a hit-map was generated
(Supplementary Table S1). Most trans-acting sRNAs
have seeds >9 nt (10,13,43); therefore, we used a
length of 10 or more perfectly matching nucleotides as
a cut-off. To test the statistical significance, a similar
analysis predicting sRNA-complementary motifs was
performed on a set of unrelated and randomized mRNA
sequences [from plants; (44)]. A significantly lower number
of hits was obtained for seeds >10 nt (Fisher's exact test,
P<0.05).
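
The custom Perl scripts are not reproduced in the text. The seed-pair search can be illustrated with the following Python sketch, which is an assumption about the general approach rather than the authors' implementation: it slides a window along the sRNA in 1-nt steps, takes every sub-sequence at or above the length cut-off, and looks for its perfect Watson-Crick (antiparallel) complement in the target. All names and sequences are hypothetical.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Sequence that pairs perfectly (antiparallel) with `rna`."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_hits(srna: str, target: str, min_len: int = 10):
    """Return (seed_start, seed_length, target_position) for every seed
    of at least `min_len` nt whose perfect complement occurs in `target`.

    Positions are 0-based; DNA input (T) is converted to RNA (U).
    """
    srna = srna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    hits = []
    for start in range(len(srna) - min_len + 1):          # 1-nt sliding window
        for end in range(start + min_len, len(srna) + 1):
            site = reverse_complement(srna[start:end])
            pos = target.find(site)
            while pos != -1:                              # record every occurrence
                hits.append((start, end - start, pos))
                pos = target.find(site, pos + 1)
    return hits


# Hypothetical sequences; illustrative only.
example_srna = "GGAUCCAUGCAAUU"
example_target = "AAAAGCATGGATCCAAAA"
example_hits = seed_hits(example_srna, example_target, min_len=10)
```

Running the same function on shuffled or unrelated sequences gives the background hit counts needed for a comparison such as Fisher's exact test.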

In order to determine the conservation of the 
sRNA-target interaction sites for specific sRNAs and 
target mRNAs, we used sequences from three closely 
related species: S. meliloti, S. medicae and Rhizobium sp.
NGR234. We extracted the longest 'seed' sequence 
from the target as well as sRNA. Sequences homologous 
to these 'seed sequences' were extracted from the afore- 
mentioned multiple alignments of the sRNAs and 
targets. Separate conservation profiles for sRNAs and 
targets were generated using the 'WebLogo' program 
(45,46). It is advantageous to use this methodology of 
determining the conservation as it gives frequencies of 
conservation of a given nucleotide at a particular 
position, consensus sequence and the total sequence 
conservation. 
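
The quantities that a sequence logo displays can be approximated with a short sketch. This is a simplified illustration and not the WebLogo implementation: it assumes gap-free aligned RNA sequences of equal length and omits the small-sample correction that logo programs usually apply. The function names and example sequences are hypothetical.

```python
from collections import Counter
from math import log2


def conservation_profile(aligned):
    """Per-column base frequencies and information content (bits).

    Information content is 2 - H, where H is the Shannon entropy of the
    column and 2 bits is the maximum for a four-letter alphabet.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs = {base: n / len(column) for base, n in counts.items()}
        entropy = -sum(p * log2(p) for p in freqs.values())
        profile.append((freqs, 2.0 - entropy))
    return profile


def consensus(profile) -> str:
    """Most frequent base at each position."""
    return "".join(max(freqs, key=freqs.get) for freqs, _ in profile)


# Hypothetical aligned seed sequences from three species; illustrative only.
example_alignment = ["ACGU", "ACGA", "ACGC"]
example_profile = conservation_profile(example_alignment)
example_consensus = consensus(example_profile)
```

A fully conserved column carries 2 bits, whereas a column in which three different bases occur equally often carries about 0.42 bits.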

Over-expression of sRNAs 

For over-expression of sra16 and sra35, genomic DNA
corresponding to the appropriate sra coding sequence
was PCR amplified from isolated S. meliloti Rm1021
genomic DNA with engineered 5'-XhoI and 3'-KpnI
restriction sites. These DNA fragments were then cloned
into XhoI/KpnI-digested pMS03 (47) under the control
of the constitutive Trp promoter. Transfer of
pMS03-sra16 and pMS03-sra35 into S. meliloti (SRA16
and SRA35) was achieved by tri-parental mating
followed by selection of S. meliloti harboring the plasmid
on LB/MC containing 100 µg/ml spectinomycin and
500 µg/ml streptomycin. WT Rm1021 strains were also
grown with streptomycin. Over-expression of sra16 and
sra35, and its effects on their predicted targets, were evaluated
by qPCR analyses after growing the cells to late
exponential/early stationary phase. We evaluated the effect of the
pMS03 vector in Rm1021 by similarly generating
control Rm1021 strains with empty vector. No significant
difference in growth or gene expression was observed
between the WT and the control WT strain harboring
empty vector, which confirmed that the presence of the
vector had no effect on the bacteria (data not shown).
Similarly, over-expression of sra35 had no effect on
growth of S. meliloti; the sra16-over-expressing strain grew
marginally better than WT or WT with empty vector
controls (data not shown).

Yeast two-hybrid analysis 

For cloning, XL1-Blue E. coli were plated on LB media
supplemented with 100 µg/ml of ampicillin and incubated
at 37°C. Yeast were grown on synthetic complete
medium containing 2% dextrose (SCD) and lacking the 
appropriate nutrient (48). Plates were supplemented with 
2% agar. Yeast cultures and plates were incubated 
at 30°C. 

Plasmids were constructed by PCR amplification of 
the targeted gene from E. coli genomic DNA (strain 
MC4100) using oligonucleotides containing homology to 
the gene of interest flanked by either EcoRI for the
5'-primer or BglII for the 3'-primer (Supplementary
Table S3). The 5'-primer lacked homology to the start
codon of the amplified ORF to ensure correct fusion to
the Gal4 binding domain (BD) or activation domain
(AD). After amplification, the amplicon was digested
with EcoRI and BglII, and ligated to EcoRI- and
BglII-digested pGBD-C2 and pGAD-C2 plasmids
(Supplementary Table S2). Correct clones were confirmed 
by DNA sequencing. 

To check for protein interactions, the budding yeast 
strain PJ69-4A (49) (MATa trp1-901 leu2-3,112 ura3-52
his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 GAL2-ADE2
met2::GAL7-lacZ) was transformed sequentially using a
standard lithium acetate protocol (50). In the first
round, PJ69-4A was transformed with plasmids
pGBD-C2, pM122, pGAD-C2 and pM130, selecting for
either Trp+ (pGBD-C2, pM122) or Leu+ transformants
(pGAD-C2 and pM130) on SCD-Trp or SCD-Leu,
respectively. Colonies from the first round were subsequently
transformed with the second plasmid containing the
putative interacting protein and selected for on
SCD-Leu-Trp. Finally, multiple Leu+ Trp+ colonies from each
transformation were pooled and resuspended in 100 µl of
ddH2O. Ten microliters of the resuspended colonies were
spotted on SCD-Leu-Trp to confirm growth and
SCD-Leu-Trp-Ade to select for protein-protein interactions.
Plates were incubated at 30°C for 5 days before
interpretation of data.

RESULTS 

SMc01113 shows sequence and structural similarities to
the AGO proteins 

SMc01113 homologs are found in nearly every bacterium 
including symbiotic and pathogenic bacteria 
[Supplementary Figure S1; (24,27)]. SMc01113 and its
E. coli homolog, YbeY, show amino acid similarity of 
>70% (Figure 1A). Expression of the E. coli homolog 
complements the symbiotic defects of the smc01113 
mutant of S. meliloti (24,51). Sequence and structural 
similarities of the YbeY to AGO protein were readily 
apparent when we performed homology studies with 
the YbeY. Sequence alignment of SMc01113/YbeY
showed similarities with AGO protein sequences from 

[Figure 1A, B: sequence alignments of YbeY_Ecoli with SMc01113 (A) and of Mid_AaAgo with YbeY_Ecoli (B).]

Figure 1. Sinorhizobium meliloti ORF, SMc01113/ybeY, shares similarities with Ago protein. (A) Alignment of the protein sequences of the S. meliloti
and E. coli homologs, SMc01113 and YbeY. (B) Alignment of protein sequences of the YbeY and Ago-Mid domain. Structural alignment of
YbeY (green) with the Ago-Mid domain (red) from N. crassa (C) and from the A. aeolicus AGO protein (2NUB) (D). Manually docked PO4 shows potential
interactions with the Arg59 and Lys61 residues (E) in the positively charged cavity (F). (G) The lowest energy docked 4mer RNA (ClusPro 2.0 web server)
onto the YbeY surface fits nicely into the protein cavity, with the negatively charged RNA backbone aligned towards the positively charged protein surface.



Neurospora crassa, A. aeolicus and other species. In particular,
SMc01113/YbeY showed sequence similarity with
the MID domain of the AGO protein (Figure 1). The MID
domain contains the binding site for the 5'-end of the
eukaryotic sRNAs, which binds the PO4 group of a nucleic
acid. The structure of E. coli YbeY (27) has been solved, as
have the structures of AGO proteins from A. aeolicus and
Thermus thermophilus (32,33). Recently, the crystal
structure of a eukaryotic MID-domain, recognizing the 
5'-terminal phosphate of a guide RNA, has also been 
solved, and it highly resembles the previous structures of 
MID domains from other species (52). 
A structural homology search with YbeY (PDB ID:
1XM5) using the DALI server (34) against the full
PDB database and PDB90 (34) resulted in multiple
hits corresponding to a wide variety of domains with
RNA-binding, enzymatic and regulatory functions.
Among all the DALI hits, the MID domain of N. crassa
AGO (PDB ID: 2XDY), Pyrococcus furiosus AGO (PDB
ID: 1Z26) and the MID domain from human AGO
(PDB ID: 3LUJ) show statistically significant Z-scores
of 3.6, 3.2 and 2.9, respectively. Further, a pair-wise structure
alignment using DaliLite (35) yielded a significant
Z-score of 4.7 for YbeY (PDB ID: 1XM5) and the
N. crassa AGO MID domain (PDB ID: 2XDY). Indeed,
there are 83 equivalent C-alpha positions between YbeY
and the N. crassa AGO MID domain with a root
mean square deviation (RMSD) of 3.4 Å (Figure 1C).
Further, we tested the structural similarities of YbeY
with the MID domain of the A. aeolicus AGO protein;
structural superimposition showed an RMSD of 2.6 Å for
72 aligned residues, indicating structural similarity
(Figure 1D).
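
For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows how an RMSD over equivalent C-alpha positions is computed. It assumes the two structures are already superimposed and that the residue pairing is known (in this study both were provided by DALI and COOT); it does not perform the superposition. The function name and coordinates are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between paired 3D coordinates (in Å).

    coords_a[i] and coords_b[i] are (x, y, z) tuples for the i-th pair
    of equivalent C-alpha atoms in two superimposed structures.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Hypothetical coordinates for two aligned atom pairs; illustrative only.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
example_rmsd = rmsd(structure_a, structure_b)
```

Because the value depends on how many residues are paired, an RMSD is best read together with the number of aligned positions, as reported in the text.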

Encouraged by these results, we further explored the 
potential of YbeY to bind RNA. YbeY and a 4mer 
RNA (52) were docked by using the fully automated, 
web-based program ClusPro version 2.0 (53,54) with 
default parameters (Figure 1E-G). The ClusPro docking 
server yielded 69 top-scoring solutions based on the 
balanced, electrostatic-favored, hydrophobic-favored 
and van der Waals plus electrostatic-favored coeffi- 
cients (53,54). The 4mer RNA fits nicely into the protein 
cavity (Figure 1G) and the negatively charged RNA 
backbone is aligned towards the positively charged 
protein surface of the cleft. Most strikingly, in the 
top-scoring model (Figure 1E-G) the free PO4 group of
RNA interacts with Arg59 and Lys61 (Figure 1E-F). The
docking results suggest a probable RNA binding site 
in the YbeY protein. We further subjected the YbeY struc- 
ture to analysis of surface binding properties by Hotpatch 
webserver (38). HotPatch finds unusual patches on the 
surface of proteins, and statistically computes how 
unusual they are (patch rareness), and how likely each 
patch is to be of functional importance [functional confi- 
dence (FC)]. The statistical analysis is done by comparing 
the protein's surface against the surfaces of a large set
of proteins whose functional sites are known. We 
analyzed the YbeY structure to find surface patches with 
putative DNA/RNA binding properties. DNA/RNA- 
binding refers to proteins that interact with oligonucleo- 
tides either catalytically or non-catalytically. Interestingly, 
Hotpatch identified three patches with DNA/RNA 
binding properties and they fall in the same protein 
cavity where we have docked the RNA 4mer 
(Supplementary Figure S1C and D). A fourth functionally
important patch, corresponding to a His-triad (H114,
H118, H124) and possibly coordinating a metal ion, was
identified (Supplementary Figure S1D). This may provide
a hydrolytic function to YbeY. These observations raised 
the intriguing possibility that SMc01113/YbeY may be 
involved in the process of sRNA-mediated interactions, 
as are the AGO proteins. 



The S. meliloti smc01113 mutant shows symbiotic and
physiological phenotypes highly similar to those of hfq 
mutant 

Parallels between the prokaryotic and eukaryotic sRNA 
machinery have been noted (13), with the RISC complex 
serving as the protein scaffold for pairing of the sRNA 
with its target in eukaryotes, while the Hfq protein has 
been proposed to play a conceptually analogous role in 
bacteria. Members of biological pathways deregulated 
when the RNA chaperone, Hfq, is mutated, are most 
likely targets of sRNA-machinery (15). We reasoned 
that if SMc01113 is involved in sRNA-mediated inter- 
actions, physiological and molecular loss-of-function 
phenotypes of smc01113/ybey should be comparable to 
those of other established components of sRNA- 
pathway such as hfq. To begin exploring the hypothesis
that SMc01113 might participate in the processes
related to sRNA-mediated interactions, we compared the
smc01113 mutant phenotypes to those of an hfq mutant.
Loss of Hfq (19) as well as SMc01113 (24) makes
S. meliloti similarly symbiotically deficient, resulting in
high fitness costs for both the interacting partners
(Supplementary Figure S2). Swarming is an important
multicellular phenomenon characterized by the
coordinated and rapid movement of bacteria across
semisolid surfaces, which may help the bacteria to adapt
to extra- and intra-cellular niches (55). Both the hfq and
smc01113 mutants were equally defective in their ability
to swarm (Figure 2A, Supplementary Figure S3).

We evaluated the susceptibility of the hfq and smc01113 
mutants to several stresses (Supplementary Figure S3) and 
observed that they displayed strikingly similar phenotypes 
(Figure 2). Escherichia coli YbeY has been implicated in
adaptation to heat-shock (29); analogously, the loss of hfq
or smc01113 makes S. meliloti similarly susceptible to heat 
shock (Figure 2). Further, the smc01113 and hfq mutants 
displayed similar sensitivities to agents like SDS and 
ethanol that may change the surface potential of the mem- 
branes and affect cellular osmoregulation (Figure 2). 
Resistance to oxidative stress, which results from the generation of
reactive-oxygen-species, is critical to S. meliloti (56). We
evaluated the sensitivity of the smc01113 and hfq mutants to
oxidative stress (Paraquat): both were similarly sensitive
(Figure 2). We have previously reported that both the 
smc01113 mutant (24) and the hfq mutant (20) are sensi- 
tive to hydrogen peroxide. The smc01113 and hfq mutants 
were similarly sensitive to a DNA damaging agent, methyl 
methanesulfonate (MMS) (Figure 2). Additionally, we 
tested the sensitivity of the two mutants to the β-lactam
antibiotic cefotaxime, which affects bacterial cell wall
synthesis, and to elevated salt concentration. Both
mutants were similarly susceptible to cefotaxime treatment,
but not to NaCl (Figure 2). Taken together, these
observations indicate it is plausible that Hfq and 
SMc01113 play roles that are physiologically related. 

SMc01113 and Hfq regulate similar molecular targets

To test this hypothesis of functionally related roles for 
these two proteins, we assayed mRNA accumulation 
patterns in the smc01113 mutant for the genes that are 
Figure 2. Loss of hfq or SMc01113 makes S. meliloti similarly sensitive to environmental stresses. (A) The smc01113 mutant swarms as inefficiently as
the hfq mutant, compared with the wild-type strain. Strains were plated on the swarming plates (0.3% agar) and incubated at 30°C. Results of swarming for
the three strains were evaluated by measuring colony progression every day for 3 days. (B and C) Sensitivity of the smc01113 and the hfq
mutants towards a variety of stresses. Strains were grown with or without stress agents as described in the text, OD600 was measured, and percent
residual growth (B) after 24 and 48 h (for paraquat, MMS, ethanol and NaCl) or (C) 48 h (for heat shock, SDS and cefotaxime) was calculated.
differentially regulated in the hfq mutant and observed
striking overlaps (Figures 3 and 4). We recently identified 
proteins significantly affected in their expression in the 
S. meliloti hfq mutant (20). Of the 55 differentially 
accumulated proteins in the hfq proteome, 30 correspond to 
annotated genes. Furthermore, we found that the 
activity of KatB and SodC was reduced in the hfq 
mutant compared to a wild-type (WT) strain (20). In 
order to understand how loss of SMc01113 affected the 
transcriptional accumulation of the targets that are under 
Hfq control, we performed qPCR analyses for 13 of the 
genes (namely, sodC, katB, agpA, atpD, ppiA, ehuB, 
cysK1, livK, dppA1, dppA2, aapJ, glnA and thtR), which 
are associated with diverse processes such as transport and 



oxidative/superoxide resistance. Most of these genes have 
been confirmed in independent studies to show differential 
accumulation in hfq mutants of S. meliloti, E. coli, 
Vibrio cholerae and Salmonella (20-22) at the RNA or 
protein level. Compared to the WT Rm1021 strain, 
accumulation of eight of these genes was 
similarly down-regulated (Figure 3) and four similarly 
up-regulated (Figure 4) in both mutants, whereas one 
gene showed a differential accumulation pattern between 
the hfq and smc01113 mutants. 
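
The normalization used for the qPCR data, in which transcript accumulation in the WT strain is fixed to 1, can be sketched as follows. This is a minimal illustration only: it assumes the common 2^-ΔΔCt calculation against a reference gene, which this section does not specify, and the Ct values are invented for the example and are not data from this study.

```python
def relative_abundance(ct_target, ct_reference, ct_target_wt, ct_reference_wt):
    """Fold change relative to WT by the 2^-ddCt calculation (assumed method)."""
    delta_sample = ct_target - ct_reference        # normalize to reference gene
    delta_wt = ct_target_wt - ct_reference_wt      # same normalization for WT
    return 2.0 ** -(delta_sample - delta_wt)

# Hypothetical Ct values per strain: (target gene, reference gene).
ct = {"WT": (22.0, 18.0), "hfq": (24.0, 18.0), "smc01113": (23.0, 18.0)}

wt_target, wt_ref = ct["WT"]
fold = {
    strain: relative_abundance(target, ref, wt_target, wt_ref)
    for strain, (target, ref) in ct.items()
}

for strain, value in fold.items():
    print(f"{strain}: {value:.2f}")
```

By construction the WT value is 1, so a value below 1 in a mutant corresponds to down-regulation and a value above 1 to up-regulation.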

Compared to the WT Rm1021 strain, the genes 
involved in protection from oxidative/superoxide stresses, 
such as katB (catalase B), sodC (superoxide dismutase C) and 
cysK1 (cysteine synthase A), were down-regulated in both 
Figure 3. Loss of hfq or SMc01113 similarly down-regulates target transcript accumulation. The WT strain and the smc01113 and hfq mutants were grown 
in RMM to exponential (ppiA, katB and cysK1) or stationary phase (sodC, agpA, livK, ehuB and atpD), RNA was extracted and qPCR was 
performed. Transcript accumulation in the WT strain was fixed to 1 and transcript abundance in the two mutants was evaluated relative to WT. 
Single and double asterisks show significant differences at P<0.05 and P<0.01, respectively. 
Figure 4. Up-regulated target transcript accumulation in S. meliloti strains mutated for hfq or SMc01113. The WT strain and the smc01113 and hfq mutants 
were grown in RMM to stationary phase, RNA was extracted and qPCR was performed. Transcript accumulation in the WT strain was fixed to 
1 and transcript abundance in the two mutants was evaluated relative to WT. 



the smc01113 and hfq mutants (Figure 3). Similarly, the 
expression levels of agpA [α-galactoside-binding protein; 
periplasmic binding protein gene required for the 
utilization of sugars (57)], atpD (F0F1 ATP synthase beta 
subunit), ppiA [peptidyl-prolyl cis-trans isomerase; 
involved in proper folding of newly synthesized proteins, 
resistance to antibiotics and the cell cycle (58,59)] and ehuB 
[ectoine-binding protein; belonging to the ATP-binding 
cassette transporters, central to osmoprotection under 
hyper-osmotic stress in S. meliloti and providing substrate 
specificity (60,61)] were down-regulated in both 
mutants when compared to WT Rm1021 (Figure 3). 
Somewhat surprisingly, the expression of livK (which 
was up-regulated in the hfq proteome) was transcriptionally 
down-regulated in both the smc01113 and hfq mutants 
(Figure 3). 



Transporters involved in peptide [dppA1 and dppA2; 
(62)] and amino acid and solute transport [aapJ; (63)] were 
up-regulated in both the smc01113 and hfq mutants 
when compared to their transcript levels in WT Rm1021 
(Figure 4). Similarly, glnA (glutamine synthetase 1, a key 
enzyme in nitrogen assimilation) was up-regulated in both 
mutants (Figure 4). However, thtR (a sulfotransferase) 
mRNA showed a difference in accumulation pattern 
between the hfq and smc01113 mutants. No difference in 
mRNA levels was observed between the WT and the hfq 
mutant, whereas the smc01113 mutant showed enhanced 
thtR mRNA levels compared to the WT and the hfq mutant 
(Figure 4). Taken together, these observations suggest 
that bacterial SMc01113/YbeY proteins play regulatory 
roles that are highly related, but not completely identical, 
to those played by the RNA chaperone Hfq. 
In silico identification of possible sRNAs and their binding 
sites in the target genes differentially regulated in hfq 
and smc01113 mutants 

These results prompted us to search for the 
sRNAs that may regulate the altered expression of 
these targets in the hfq and smc01113 mutants. Our 
hypothesis predicted similar patterns 
of sRNA accumulation in the hfq and smc01113 
mutants. We therefore undertook a computational 
approach to identify possible sRNAs that might target 
the 13 genes tested above. 

sRNA-mediated regulation of gene expression falls into 
two broad categories (10). The first is mediated by cis-encoded 
antisense sRNAs, which have extensive complementarity 
(>75 bp) with their targets, and the second by trans-encoded 
sRNAs, which have more limited complementarity. 
The trans-encoded sRNAs have 10-25 bp 'seed sequences' 
complementary to their target mRNAs in 
discontinuous patches, where only a core of nucleotides 
seems to be critical for regulation. These sRNAs regulate 
translation by targeting regions in the 5' UTRs, inside and 
upstream of the ribosome-binding sites (RBS)/Shine- 
Dalgarno sequences, occluding or sequestering RBS, as 
well as binding within the coding sequences of the 
targets largely leading to their degradation (13,43,64,65). 
The trans-encoded sRNAs are functionally analogous to 
eukaryotic microRNAs (miRNAs), often require Hfq for 
their action (11,13), may pair with targets in the coding region 
(43) and thus could be the preferred mode of regulation of 
the differentially accumulated targets identified in the hfq and 
smc01113 mutants. The principle of seed pairing has been 
extensively and effectively used for determining sRNA-mRNA 
interactions in eukaryotes (39,41,44,66,67) as well 
as in some prokaryotes (40,42). In general, the longer the 
seed sequence (and thus the greater the extent of perfect 
complementarity), the higher the chances of an effective 
sRNA-target interaction. 
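
The idea of scoring candidate sites by the length of perfect seed complementarity can be illustrated with a short sketch. It is a simplified illustration and not the pipeline used in this study: it considers only contiguous Watson-Crick pairing (no G:U pairs or bulges), the function names are hypothetical, and the two sequences are invented toy examples.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA (or DNA) string, returned as RNA."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]

def find_seed_sites(srna, mrna, min_len=10):
    """Return sorted (mrna_start, length) tuples, one per maximal stretch of
    perfect complementarity of at least min_len nt (0-based mRNA start)."""
    probe = reverse_complement(srna)   # reads in the same sense as the mRNA
    target = mrna.upper().replace("T", "U")
    sites = []
    prev = [0] * (len(target) + 1)     # match lengths for the previous probe base
    for i in range(1, len(probe) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] != target[j - 1]:
                continue
            cur[j] = prev[j - 1] + 1   # extend the match along the diagonal
            at_end = i == len(probe) or j == len(target)
            # Record a match only where it cannot be extended any further.
            if cur[j] >= min_len and (at_end or probe[i] != target[j]):
                sites.append((j - cur[j], cur[j]))
        prev = cur
    return sorted(sites)

# Toy sequences (invented): the sRNA carries a 12-nt stretch complementary
# to positions 5-16 of the mRNA.
toy_srna = "AACUGACUAGCCAUAA"
toy_mrna = "CCCCCAUGGCUAGUCAGCCCCC"
sites = find_seed_sites(toy_srna, toy_mrna, min_len=10)
print(sites)
```

Raising `min_len` makes the search more stringent, which mirrors the reasoning that longer perfect seeds are more likely to reflect an effective interaction.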

Informed by these considerations, we performed a 
bioinformatic analysis to identify S. meliloti sRNAs (17) that 
could potentially bind to the genes differentially regulated 
in the hfq and smc01113 mutants. Our analysis revealed 
that these genes have a large number of potential sRNA 
binding sites (Supplementary Table S1). Genes with 
multiple binding sites of >10 nt for specific 
sRNAs were identified; e.g. dppA1 had three different 
complementary sites for the sRNA sra16 (Figure 6B). It 
was strikingly apparent that a single sRNA has the 
potential to target more than one gene, and that a single 
gene can be targeted by several sRNAs (Supplementary 
Table S1; Figure 6A). For instance, sodC had a 13-nt 
complementary site for one sRNA, 12-nt seeds 
for five sRNAs and 11-nt seeds for eight sRNAs 
(Supplementary Table S1). The potential for a single 
sRNA to target multiple mRNAs is also striking. For 
example, our analysis indicated that sra35 potentially 
targets three genes, sra03 may target six genes and sra16 
may target 10 genes (Supplementary Table S1, Figure 6A). 
Similar observations of 'one-to-many' sRNA-mRNA 
interactions, common in eukaryotes [e.g. (39)], have been 
previously reported in some other bacterial species, and 



would have profound effects on cellular physiology and 
stress adaptation (43,65,68-70). Using a combination of 
the criteria of longest seed length and potential to target 
several genes, we selected nine sRNAs for expression 
analysis in the WT, the smc01113 mutant and the hfq mutant. 

Expression of sRNAs is similarly affected in S. meliloti 
smc01113 and hfq mutants 

Using qPCR analyses, we next investigated the 
relationship between changes in sRNA accumulation in the smc01113 
and hfq mutants and the changes in the various potential 
mRNA targets. Of the nine sRNAs tested, the 
accumulation of eight (sra03, 11, 16, 45, 47, 51, 63 and 65) 
was similarly reduced in both the hfq and smc01113 
mutants compared to WT Rm1021 (Figure 5). In 
contrast, sra35 was strongly up-regulated in both the hfq 
and smc01113 mutants relative to the WT Rm1021 
strain (Figure 5). The pattern of variation of the sRNAs 
relative to particular mRNA targets indicated both inverse 
correlations (up-regulation of the target corresponding to 
down-regulation of the sRNAs and vice versa) and 
direct correlations (change of the target and the targeting 
sRNAs in the same direction; Figure 6A). 

sRNA-mediated interactions in bacteria result in both 
the inhibition and the stimulation of translation 
(10,13,43,64,65,69). Inverse correlations between the expression of 
sRNAs and their potential targets (red/green regions in 
Figure 6A) suggest an inhibitory action of these sRNAs on 
their targets; conversely, similarly regulated sRNA-target 
pairs (yellow regions, Figure 6A) suggest a stimulatory 
action of the sRNAs in these interactions. Thus, these 
patterns suggest the existence of complex interactions 
between the sRNAs and their targets that are necessary 
for modulating and fine tuning the expression of genes to 
optimum levels. 
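
The distinction between inverse and direct correlations can be made concrete with a small sketch. The directions of change and the sRNA-target pairs below are those named in this paper for sra16 and sra35 in the two mutants; the data structures and the `classify` helper are hypothetical, and the snippet only illustrates the classification logic, not how Figure 6A was actually produced.

```python
# Direction of change in both mutants relative to WT, as reported in the text
# (-1 = down-regulated, +1 = up-regulated).
SRNA_CHANGE = {"sra16": -1, "sra35": +1}
TARGET_CHANGE = {"glnA": +1, "dppA1": +1, "dppA2": +1, "ppiA": -1, "atpD": -1}

# Predicted sRNA-target pairs named in the text (a subset of the predictions).
PREDICTED = {
    "sra16": ["glnA", "dppA1", "dppA2"],
    "sra35": ["glnA", "ppiA", "atpD"],
}

def classify(srna, target):
    """'direct' if sRNA and target change in the same direction, else 'inverse'."""
    same_direction = SRNA_CHANGE[srna] == TARGET_CHANGE[target]
    return "direct" if same_direction else "inverse"

labels = {
    (srna, target): classify(srna, target)
    for srna, targets in PREDICTED.items()
    for target in targets
}

for (srna, target), label in labels.items():
    print(f"{srna} -> {target}: {label}")
```

Under the interpretation given above, an 'inverse' label is consistent with an inhibitory action of the sRNA on its target and a 'direct' label with a stimulatory one.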

To gain further insight into the biological relevance of 
the putative sRNA-target interactions described above 
(Figure 6A, Supplementary Table S1), we performed 
phylogenetic comparisons, across multiple rhizobial species, 
of the sRNAs and the sRNA-target motifs in mRNAs found 
in our in silico analysis. Sequences homologous to 
several of these differentially regulated sRNAs were 
identified in three or more related Rhizobium species 
(Supplementary Figures S4 and S5), suggesting their 
wide organismic relevance in symbiosis. Some of these 
sRNA sequences have been reported in other 
α-proteobacterial species (17). Conservation of mRNA 
targets with complementarity to the seed (2-7 nt) of the 
miRNAs in closely related species has been used as one of 
the main criteria for identifying miRNA-target interactions 
in eukaryotes (41,71-73). On the other hand, sRNAs/ 
targets in viruses are not well conserved (74). When we 
analysed the conservation of the complementary 
sites in the mRNAs as well as in the sRNAs in three 
closely related rhizobial bacteria, we found that the motifs 
of the sRNAs as well as their complementary sequences in the 
targets were fairly well conserved (Figure 6C). SNPs were 
seen in both sRNAs and mRNAs of other rhizobial 
species (Figure 6C). Single-base substitutions 
from T to G/C (and vice versa) were seen in a few cases. 
Figure 5. Loss of hfq or SMc01113 similarly deregulates sRNA accumulation. The WT strain and the smc01113 and hfq mutants were grown in RMM to 
exponential (sra03, 11, 16, 45, 47, 51 and 65) or stationary phase (sra35 and sra63), RNA was extracted and qPCR was performed. Transcript 
accumulation in the WT strain was fixed to 1 and transcript abundance in the two mutants was evaluated relative to WT. Single and double asterisks show significant differences at P<0.05 and P<0.01, respectively. 



Such mismatches in these sRNA-mRNA motifs may be 
analogous to eukaryotic miRNA target sites with seeds 
having G:U base pairs or single-nucleotide bulges. These 
results suggest a large degree of conservation of these sites 
among closely related rhizobia; the sites are thus interesting 
targets for future study of sRNA-mRNA interactions in 
rhizobial species. 

Over-expression of sRNAs 

In order to evaluate our predictions of sRNA targets, we 
over-expressed sra16 and sra35 (Figure 7A and B) and 
evaluated the effects on the targets we initially predicted 
(Figure 6). Sra16 was down-regulated whereas sra35 was 
up-regulated in the hfq and smc01113 mutants compared to 
WT Rm1021 (Figure 5). We tested the accumulation of four 
of the 10 predicted targets of sra16 and all three 
predicted targets of sra35 (Figure 6). Putative target sites for 
sra16 and sra35 did not overlap with the ribosome-binding 
sites. Three sra16 targets (glnA, dppA2 and katB) were 
down-regulated in the sra16-over-producing strain 
(SRA16; Figure 7C, E and G) compared to WT, 
whereas the fourth (dppA1) did not show a difference 
between the two strains (data not shown). Over-expression 
of sra35 (SRA35; Figure 7) resulted in up-regulation of 
glnA (Figure 7D), whereas two other targets (ppiA and 
atpD) showed a trend of down-regulation (Figure 7F 
Figure 6. sRNA-target relation in S. meliloti and conservation of motifs in sRNAs and their possible targets in other related rhizobial species. 
(A) Correlation between the direction of accumulation of the sRNAs and their targets in the S. meliloti strains mutated for hfq and SMc01113, as 
derived from Figures 3-5. Red and green show inverse correlation, and yellow shows that the sRNAs and their predicted targets accumulate 
in the same direction. (B) Schematic representation of the three sra16 binding sites in the dppA1 gene. Sra16 was down-regulated whereas dppA1 
was up-regulated in the hfq and smc01113 mutants. (C) Conservation of the target-sRNA motif sequences in the three related rhizobial species. 
Weblogos were generated as described in the text. 



and H). These results show that our predictions were 
identifying biologically relevant sRNA-mRNA interactions 
and that individual sRNAs have multiple targets. Sras 16 
and 35 differently regulated the accumulation of a 
common target, glnA: whereas glnA was down-regulated 
in bacteria over-expressing sra16, over-expression of sra35 
up-regulated glnA expression, providing evidence that 
multiple sRNAs may regulate a given target differently, 
and that the regulation of gene expression by sRNAs 
is complex. Thus, we were able to largely confirm 
our predictions of sRNA targets as well as 
target-sRNA relationships. 

Potential interactions of SMc01113/YbeY with other 
components of sRNA machinery/RNA-degradation 

Since our data suggested that the function of SMc01113/ 
YbeY is related to Hfq, we next asked whether SMc01113/ 
YbeY physically interacts with the Hfq protein or with 
other members of the sRNA machinery and the RNA 
degradation machinery. The sRNA machinery and the components 
of the RNA degradosome are well characterized in E. coli 
(75). RNase E (rne) has been implicated as an integral 
component of the sRNA-machinery and degrades targets 
of sRNAs (10,13,76,77). RNase III (rnc), whose eukary- 
otic analogs are the DICER/dicer-like (DCL) proteins, 
participates in sRNA-mediated interaction, along with 
the RNase E, by performing initial cleavage (78,79). The 
PIWI domain of the AGO proteins has an RNase 
H (rnhB) type of catalytic domain (32). Two other 
RNases that are important components of the RNA 
degradosome during stress adaptation are RNase R and 
PNPase; these have overlapping functions (80). 
Informed by the fact that sRNAs can determine which 
proteins bind to Hfq (81), we asked if we could detect 
an interaction of YbeY with Hfq and other RNases in 
the absence of bacterial sRNAs. We therefore conducted 
a directed yeast two-hybrid screen to determine if YbeY 
interacts physically with any of these components. 
Figure 7. Over-expression of sRNAs affects accumulation of their 
predicted targets. (A and B) show the levels of over-expression of sra16 
and sra35, respectively. (C, E and G) show down-regulation of sra16 
targets. (D, F and H) show the changes in putative sra35 targets. 
Sras 16 and 35 differently regulate the accumulation of a common 
target, glnA (C and D). sRNA-targeting regions were predicted in the 
coding region. 



However, no significant interaction with any of these was 
observed, regardless of whether YbeY was fused to 
the GAL4 activation domain or the binding domain (Figure 8). This 
suggests that either (i) there is no physical interaction 
of YbeY/SMc01113 with Hfq and the other RNases, 
or (ii) the interaction of YbeY/SMc01113 with these 
components is highly transient and/or may require additional 
facilitator proteins. A transient interaction with YbeY 
mediated by other proteins would be consistent with the 
fact that there is a large difference between the number of Hfq 
protein molecules per cell [approximately 30 000-60 000; (82)] 
and the number of YbeY/SMc01113 molecules (approximately 1000; 
Walker, G.C., unpublished data). The 
involvement of other co-factors in modulating 
protein-protein interactions during sRNA regulation is 
conceivable. For example, it has recently become apparent that 
the RraA regulatory protein modulates the helicase and 
RNA-binding activities of the RNA degradosome (83). 
Furthermore, the association of the AGO-siRNA 
complex with AGO-binding proteins, e.g. Giw1p, for 
correct localization is known (84). We also 
generated an S. meliloti double mutant of hfq and 
smc01113, which showed highly reduced growth and 
a longer doubling time than either single mutant 
(Figure 8). As discussed more fully below, these 
observations are consistent with SMc01113/YbeY and 
Hfq playing functionally related roles in sRNA regulation 
but with YbeY playing certain Hfq-independent roles (81) 
as well. 



DISCUSSION 

Taken together, our results suggest that an important, 
previously unrecognized function of the SMc01113/ 
YbeY proteins, members of the UPF0054 family, which 
is present in most bacterial species, is to participate in 
sRNA regulation. The SMc01113/YbeY proteins show 
intriguing structural similarities to the MID domain of 
the AGO proteins, which has a capacity to capture the 
5'-phosphate of sRNA seed regions. Loss of SMc01113 
results in perturbed sRNA accumulation, deregulated 
mRNA levels and, consequently, physiological and 
symbiotic defects. Furthermore, there are striking similarities in 
the symbiotic, physiological and molecular profiles of the 
smc01113 and hfq mutants. We therefore 
suggest that SMc01113 homologs may form a component 
of the sRNA pathway and facilitate effective outcomes of the 
sRNA-mRNA interaction: in particular, we suggest that 
SMc01113/YbeY proteins could help in recognizing the 
prokaryotic sRNAs during their Hfq-mediated interaction 
with the target mRNAs. In addition, this study gives 
insights into the sRNA-mediated regulatory 
basis of stress adaptation in S. meliloti by identifying 
multiple sRNA-target interactions. Although sRNAs have 
been identified in a few genomic studies of S. meliloti 
(16-18), little is known about their functions, targets 
and regulation. Understanding regulatory mechanisms in 
S. meliloti is of utmost importance from agricultural, 
economic and environmental perspectives, as well as for 
its implications for other symbiotic and pathogenic 
interactions. 

An estimated 82 sRNAs were originally predicted in 
S. meliloti (16-18), whereas the most recent genome-wide 
survey puts the number at 1125 [which included trans- 
encoded sRNAs, antisense RNAs, putative mRNA 
leader and sense transcripts, which may be degradation 
products of mRNAs; (85)]. Genome-wide searches in 
E. coli estimate that the number of sRNAs is ~2% of 
the number of protein coding genes (77,86). In bacteria, 
a major class of these sRNAs, as in eukaryotes, act by 
base pairing with the target mRNAs. In response to 
environmental stress, several sRNAs may 
be stabilized by the RNA chaperone Hfq, which facilitates 
the trans-annealing of the sRNAs to their target mRNAs in 
an antisense manner (70,87). This is followed by the 
recognition of the sRNA-target complex by a 
multi-ribonucleoprotein complex, resulting in mRNA 
degradation (11,75,88) or in mRNA stabilization, facilitated 
ribosome assembly and enhanced translation (10,13,64,69). 
However, these processes are still poorly understood, 
as little is known regarding the assembly of the sRNA-mRNA 
complex, except that Hfq is involved (11,13,81). 

Conceptually, striking parallels exist between the pro- 
karyotic and the eukaryotic sRNA-pathways in terms of 
synthesis of sRNAs, the presentation of sRNA and 



mRNAs onto the protein scaffold and the eventual 
outcomes of sRNA-mRNA interactions (13). Both the 
sRNAs and the mRNAs bind to the Hfq protein, but at 
different sites (87). However, mechanistically, little is 
known about how the prokaryotic sRNAs are recognized 
and loaded on to the Hfq protein scaffold during their 
interaction with the target mRNAs. This is better under- 
stood in eukaryotes, where the si/miRNAs are loaded into 
AGOs, and then guided and assembled on to the targets 
by the RISC. The recognition of target is governed by the 
'seed sequence' of the guide RNA strand associated with 
the MID domain of the AGO proteins that generates 'an 
enhanced affinity anchor site promoting fidelity in target 
recognition' (89). This stabilizes and guides the assembly 
of the active RISC (89). The structure of the MID domain 
associated with the dsRNA and ssDNA reveals that the 
MID-domain recognizes the 5'-phosphate specifically and 
anchors sRNAs on to the AGO/RISC (32,52). The 5'-end 
of the guide strand (of the sRNA) could bind to the AGO in a 
stacked helical conformation, both in the absence and 
presence of the partner mRNA strand, facilitating easy 
access for recognition of the mRNA target (32,52). 
Moreover, the MID domain has been implicated in 
protein-protein interactions: interactors such as Tas3 in 
S. pombe form a so-called 'Ago-hook' that binds the 
MID domain (90). Whether the binding of the protein 
interactors and sRNAs occurs simultaneously or is 
mutually exclusive remains unclear. Different AGO 
family members have diverse functions in a given 
organism (91). The MID domains are central to the func- 
tional distinctions that define the various AGO protein 
family members (92). Proteins that perform analogous 
functions in the rhizobial or other bacterial sRNA- 
pathways have remained unknown. Our structural 
analyses suggest that SMc01113/YbeY and its 
bacterial homologs might perform the analogous function 
of anchoring the sRNA seed, which can help the sRNA 
access the mRNA. SMc01113/YbeY proteins have >50% 
sequence similarity to MID domains, and furthermore 
have structural folds similar to those of both the 
prokaryotic and eukaryotic MID domains [Figure 1; (52)]. 
Moreover, our structural analysis suggests that they 
similarly have a capacity to bind to the sRNA seeds, and 
they regulate the accumulation of sRNAs and target mRNAs 
(Figures 1, 3-5). Our analyses suggest that, similarly to 
the corresponding residues of the MID domain, R59 
and K61 may undergo key interactions with a phosphate 
at the 5'-end of an RNA sequence. The structure of YbeY 
suggests that it could accommodate an internal phosphate 
at the 5'-end of the seed sequence as well as a 
5'-phosphate at the terminus of an sRNA, instead of 
requiring that the phosphate be at the 5'-terminus. This suggests 
that SMc01113/YbeY homologs have a function that is 
analogous to that of the AGO MID domain, 
facilitating the recognition of the sRNA and thereby guiding 
it to the target mRNA/Hfq and the assembly of the active 
complex. 

YbeY differs from the AGO MID domain in one 
interesting aspect: a coordinated metal ion, postulated to be 
zinc (27), is present on the other side of the cleft opposite 
R59 and K61 (Supplementary Figure S1D), along with a 
highly conserved His-triad (H114, H118 and H124; 
Supplementary Figure S1). These could act catalytically 
in a hydrolytic reaction. In contrast to speculations that 
YbeY is a peptidase or a protease (28), our docking 
of the RNA and analysis of the YbeY structure suggest that 
it is plausible that a phosphodiester bond could be located 
close enough to the presumed coordinated water for 
hydrolysis to occur. Thus, it is possible that YbeY could 
contribute catalytically, like an RNase, to RNA cleavage 
events after binding to a guide RNA. We hypothesize a 
model in which an sRNA substrate would bind to the 
positively charged surface of YbeY with the help of a phosphate 
that plausibly interacts with K59, L61, L63 and other 
residues on the positively charged surface (Figure 1, 
Supplementary Figure S1), followed by a 
metal-coordinated hydrolysis of a phosphodiester bond by the 
His-triad (Supplementary Figure S1). Such 
metal-dependent RNA-binding and RNase activity is established 
for nucleases, for instance in V. vulnificus and Xenopus 
laevis (93,94). Indeed, the 5'-binding pocket in AGO, 
present at the junction of the MID and PIWI domains 
[the two domains now regarded as the 'MID/PIWI lobe'; 
(95)], involves a metal ion coordinated to the C-terminal 
carboxylate of the AGO polypeptide and the first (5') and 
third phosphates of the guide strand (95). It is conceivable 
that YbeY family proteins may resemble evolutionarily 
primitive AGO proteins, in which catalysis and RNA 
binding were present in one structure. With the evolution 
of complex genetic switches, the catalytic capacity became 
separated into the more RNase H-like PIWI domain of AGOs, 
whereas MID became specialized for RNA-binding and 
protein-protein interaction functions. Indeed, YbeY 
shares some structural similarities with PIWI (e.g. for the 
PIWI domain of the 2NUB AGO, the two structures superimpose with 
a significant RMSD of 3.06 Å; data not shown). 

Our proposal that SMc01113/YbeY proteins participate 
in the sRNA pathway strengthens the parallels between 
the AGO/RISC functions and the bacterial sRNA-mRNA 
assembly process that have been previously 
noted (13,81). Furthermore, as in eukaryotes, our 
computational analysis suggests a 'one-to-many' sRNA-target 
relationship in S. meliloti. Thus, this study gives the 
first line of evidence that supports more than a conceptual 
similarity (13) between the prokaryotic and eukaryotic 
RNAi machinery. The involvement of other analogous 
facilitator proteins (currently unknown), however, 
cannot be ruled out. The next step will be to work out 
the biochemical properties of the SMc01113/YbeY protein 
so that we can better understand the mechanism of such 
action. 

For most of the phenotypes we tested, along with the 
underlying molecular targets and sRNAs, the smc01113 
and hfq mutants displayed close similarity. Nevertheless, 
certain observations indicate that SMc01113 and Hfq also 
act independently. The difference in thtR levels between the 
smc01113 and hfq mutants is consistent with this 
possibility, as is the reduced growth of the smc01113 hfq double 
mutant compared to the single mutants in rich LB-MC 
medium. Recent work from Rasouly et al. (28) and from 
our lab on E. coli rRNAs (51) suggests an additional 
function for YbeY, as the deletion of ybeY from E. coli 
results in defects in ribosome assembly and activity as well 
as in attenuated processing of the ends of rRNAs (51). 
Interestingly, mutating the R59 residue, which our 
docking studies suggest may participate in coordinating 
the PO4 in an sRNA seed (Figure 1E-G), on a multicopy 
plasmid did not affect rRNA processing but did affect the 
growth rate. The possibility that a member of the sRNA 
pathway could be involved in maturation of 
rRNA ends is not at all surprising. For example, RNase 
E, central to the sRNA pathway (11,13,77,81), is also 
necessary for maturation of the ends of rRNA (96,97). In another 
example, PNPase is indispensable for 3'-end maturation of 
23S rRNA, is a component of mRNA degradation in the 
sRNA context, and is also a key regulator of sRNAs 
(81,98). Even Hfq interacts with at least 24 ribosomal 
proteins (81), and also with RNase E and PNPase (81); 
in E. coli, YbeY shows genetic interactions with RNase E 
and PNPase (51). This overlap in function and sharing of 
components between the sRNA pathway and rRNA 
maturation and ribosome assembly may form a more general 
strategy of efficient post-transcriptional gene regulation 
by coupling translation and transcription (81). This 
warrants further investigation. 

Our observations are also important because they verify, 
at the transcriptional level, the differential expression of 
genes in the S. meliloti hfq proteome. In addition, 
we have identified several sRNAs that are 
Hfq-dependent. Mapping of these differentially accumulated 
sRNAs/mRNAs, the conservation of sRNA-mRNA 
complementary motifs in rhizobial species closely related to 
S. meliloti (Figure 6), and the change in accumulation of 
predicted targets in bacteria over-expressing sRNAs 
(Figure 7) indicate functions for these sRNAs, a possible 
mode of regulation of these genes and a mechanistic 
relationship between the sRNAs and the targets. We have 
thus generated additional support for the biological 
relevance of some S. meliloti sRNA-mRNA relationships 
(Figure 7) and identified other such candidate 
relationships for further investigation. 

Our results also offer insights into the regulatory basis 
of stress adaptation in S. meliloti. Pathways such as 
transport systems, oxidative stress resistance and nitrogen 
metabolism are coordinated and controlled by the 
sRNA machinery. Along with identifying a new 
component, this study places the sRNA pathway squarely at the 
centre of stress adaptation and regulation of plasticity in 
S. meliloti as the bacterium undergoes ecosystem shifts 
during its course of survival. 